2024 Summer 
Datathon Report 


CITADEL


Authors: 
Matthew Chen, Sophia Zhu, Nicole Qiu, Alicia Ji 


Team 5 


Non-Technical Executive Summary 


Our team conducted a multifaceted analysis examining the intersection of the American 
indulgent food culture with public health and economic outcomes. We focused on dis- 
cerning the direct and indirect costs imposed by this culture on healthcare and economic 
systems. 


Our investigation utilized comprehensive datasets spanning several decades to trace cor- 
relations and potential causative relationships between dietary habits and health-related 
economic impacts. By analyzing data on meat and sugar production and their correspond- 
ing health implications such as diabetes and obesity, we aimed to quantify the broader 
economic consequences. 


Key findings from our analysis highlighted significant correlations between increased meat 
and sugar consumption and heightened healthcare expenditures due to diseases like di- 
abetes. These trends not only reflect on individual health but also illustrate broader eco- 
nomic implications, affecting national healthcare spending and influencing economic poli- 
cies. 


Moreover, our research provided insights into how the indulgent food industry could drive 
certain economic sectors, revealing potential predictive relationships between food pro- 
duction trends and stock market behaviors. This comprehensive approach allowed us to 
present a holistic view of the economic and health impacts of dietary choices in the United 
States, underscoring the need for informed policy-making to mitigate these effects. 


This executive summary encapsulates our findings, offering a clear understanding of the 
complex dynamics at play between diet, health, and economy, and emphasizes the im- 
portance of strategic interventions to foster a healthier future. 


Contents


Non-Technical Executive Summary
1 Introduction
    1.1 Health-Related Externalities
        1.1.1 Definitions
        1.1.2 Indulgent Food and Health
    1.2 Economic Costs and Benefits
        1.2.1 Definitions
        1.2.2 Economics and Indulgent Food Culture
2 Exploratory Data Analysis
    2.1 Economic Characteristics
        2.1.1 Overview
        2.1.2 Data Processing
        2.1.3 Summary Statistics
        2.1.4 Time Series Analysis
    2.2 Nutrition, Physical Activity, and Obesity
        2.2.1 Overview
        2.2.2 Data Processing
        2.2.3 Time Series Analysis
    2.3 Meat Production
        2.3.1 Overview
        2.3.2 Data Processing
        2.3.3 Time Series Analysis
    2.4 Sugar and Coffee Commodities
        2.4.1 Overview
        2.4.2 Data Processing
        2.4.3 Time Series Analysis
    2.5 Stocks and ETFs
        2.5.1 Overview
        2.5.2 Data Processing
        2.5.3 Summary Statistics
        2.5.4 Time Series Analysis
    2.6 Cross-Dataset Correlations
3 Health
    3.1 Diabetes
        3.1.1 Time Series Analysis
        3.1.2 Diabetes and Sugar Prices
        3.1.3 Diabetes and Meat Production
        3.1.4 Cost of Diabetes
4
    4.1 Stocks and Indulgent Food Econometrics
        4.1.1 Signal Preprocessing
        4.1.2 Meat Production
        4.1.3 Coffee and Sugar Prices
    4.2 Econometrics and Business Cycles
    4.3 Inflation and Implicit Costs
        4.3.1 Correlation of Commodity Consumption and Consumer Basket
        4.3.2 Geolocation
5 Conclusion
    5.2 Recommendation to the United States Government


1 Introduction 


In a country where culinary indulgence often equates to greasy burgers, sugary desserts, 
and high-calorie beverages, the impact of dietary choices on health and the economy is 
increasingly scrutinized. With obesity rates climbing to unprecedented levels, the
associated costs, both implicit and explicit, are of significant concern.


This report aims to explore these costs through a comprehensive analysis of health-related 
expenses and economic implications tied to dietary habits in the United States. 


1.1 Health-Related Externalities 
1.1.1 Definitions 


1. Obesity: medical condition characterized by excessive body fat that increases the 
risk of various health problems. It is typically measured using the Body Mass Index 
(BMI), where a BMI of 30 or higher is classified as obese. 


2. Diabetes: chronic disease that occurs when the body either cannot produce enough 
insulin (Type 1 diabetes) or cannot effectively use the insulin it produces (Type 2 
diabetes). Insulin is a hormone that regulates blood sugar. 


3. Red Meat: meat from mammals, such as beef, pork, and lamb. It is a significant 
source of protein and nutrients but is often associated with higher saturated fat con- 
tent. 


4. Poultry: meat from domesticated birds such as chickens, turkeys, and ducks. Poul- 
try is generally lower in saturated fat compared to red meat and is considered a 
healthier protein source. 


1.1.2 Indulgent Food and Health 


The annual medical costs associated with obesity are staggering, burdening both individuals
and the healthcare system. Using external datasets, we investigated the correlation
between obesity rates and medical expenses. We also analyzed the health impacts of meat
consumption and of sugar and coffee intake, and how they relate to obesity and other
health problems. By understanding these correlations, we can better grasp the broader
health implications of prevalent dietary trends.


1.2 Economic Costs and Benefits 
1.2.1 Definitions 


1. Stocks/ETFs: a collection of securities that include investment funds and exchange- 
traded products. Ownership of a stock represents a claim on part of the corporation's 
assets and earnings. High stock prices indicate strong investor confidence, while 
falling prices may signal concerns about a company's future prospects. 


2. Business Cycles: fluctuations in economic activity that an economy experiences over
a period, which consist of expansions (periods of economic growth) and contractions
(periods of economic decline).


[Figure: The economic cycle, plotting real GDP against time and marking the recessionary
trough.]


3. Unemployment Rate: percentage of the labor force that is jobless and actively seek- 
ing employment. Unemployment rate is an indicator of economic health. 


4. Inflation: rate at which the general level of prices for goods and services rises, erod- 
ing purchasing power. 


5. Consumer Basket: collection of goods and services used to measure the average 
change in prices over time. The consumer basket typically includes items such as 
food and healthcare. 


1.2.2 Economics and Indulgent Food Culture 


The economic consequences of dietary choices are multifaceted. This section of the report
focuses on identifying the correlation between processed food stocks and food production
and prices over the most recent two decades. We explore whether meat production yields
have predictive power for relevant restaurant stock prices. We also explore the strength
of the correlation between coffee and sugar prices and relevant beverage companies'
stock prices.


In addition, we assess the causal relationship between meat production and unemployment
rates during different business cycles, gaining insights from the producer's perspective.


Further, we conduct a time series analysis on the inflation of coffee and sugar product
prices, examining how these trends align with sugar and coffee consumption within the
consumer basket. The report also considers the geographical and economic disparities in
obesity rates, offering a regional perspective on economic status and its correlation
with dietary habits.


2 Exploratory Data Analysis 


In this section, we provide a brief overview of the relevant datasets we work with, pro- 
viding summary statistics and both univariate and multivariate analysis to draw insights 
from the data. We briefly describe our data processing approach, along with any insights 
gleaned. Lastly, we analyze the relationships between features relevant to assessing the 
true costs associated with the United States’ gluttonous lifestyle. 


2.1 Economic Characteristics 
2.1.1 Overview 


The Economics dataset provides socioeconomic indicators for each state in the U.S. from
2010 to 2022. Namely, for every state in each year, there is an estimate for a subset of
the population in categories regarding employment, work commute, socioeconomic class,
occupation, industry, health insurance coverage, and income & benefits.


These indicators provide a reliable basis for diagnosing the financial health of the
nation in recent years and, in conjunction with other data, can be used to analyze trends
in the economic costs of America's indulgent lifestyle.


2.1.2 Data Processing 


This economic data was preprocessed to focus on key indicators. We cleaned the dataset
by:


* Removing unnecessary columns and rows. In total, there are 8 categories and 135
subgroupings, many of which are not relevant to our question. Moreover, a quick analysis
shows that every health insurance coverage entry is missing. Columns and rows like these
are unlikely to yield relevant insight. Namely, we drop rows with missing estimate
values and keep columns related to employment, income, and poverty.


* Converting estimate values to floats. The dataset stores all numerical data, such as
the estimate and % error, as strings. We convert these to floats for numerical analysis.


* Grouping states into Midwest, South, West, and East. On the assumption that states in
the same region are likely to exhibit similar trends and characteristics, we group them
into the following conventional four regions according to the US Census [1]:


[Figure: United States Census regions and divisions.]


This preprocessing allows for a more focused analysis of regional economic trends over
time related to the financial health of Americans.
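
The cleaning steps above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code used for the report: the row keys (`state`, `category`, `estimate`), the category labels, and the truncated `REGION_BY_STATE` mapping are all assumptions made for the example.

```python
# Hypothetical row layout; the real column names in the Economics dataset may differ.
REGION_BY_STATE = {
    "Illinois": "Midwest",
    "Ohio": "Midwest",
    "Texas": "South",
    "Florida": "South",
    "California": "West",
    "New York": "East",
}  # truncated for illustration; a full mapping would cover every state


def parse_estimate(raw):
    """Convert a string such as '1,234.5' or '5.6%' to a float, or None if unusable."""
    if raw is None:
        return None
    cleaned = raw.replace(",", "").replace("%", "").strip()
    if not cleaned:
        return None
    try:
        return float(cleaned)
    except ValueError:
        return None


def preprocess(rows, keep_categories=("Employment", "Income", "Poverty")):
    """Keep relevant categories, drop missing estimates, and attach a region."""
    cleaned = []
    for row in rows:
        if row["category"] not in keep_categories:
            continue
        estimate = parse_estimate(row["estimate"])
        region = REGION_BY_STATE.get(row["state"])
        if estimate is None or region is None:
            continue
        cleaned.append({**row, "estimate": estimate, "region": region})
    return cleaned


sample = [
    {"state": "Texas", "year": 2015, "category": "Employment", "estimate": "4.5%"},
    {"state": "Ohio", "year": 2015, "category": "Income", "estimate": "51,075"},
    {"state": "Texas", "year": 2015, "category": "Health Insurance", "estimate": ""},
    {"state": "Florida", "year": 2015, "category": "Poverty", "estimate": "(X)"},
]
result = preprocess(sample)
```

Only the first two sample rows survive: the health insurance row is outside the kept categories, and the placeholder estimate cannot be parsed as a number.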


2.1.3 Summary Statistics 


We plot summary statistics for the unemployment rate, median household income, and
poverty rate of each region in the box plots below.


[Figure: Unemployment Rate Distribution by Region. Box plots of unemployment rate (%) for
the Midwest, South, West, and East.]


[Figure: Median Household Income Distribution by Region. Box plots of median household
income ($) for the Midwest, South, West, and East.]


[Figure: Poverty Rate Distribution by Region. Box plots of poverty rate (%) for the
Midwest, South, West, and East.]


The South exhibits the worst financial health of any region in the nation, with the
highest average unemployment rate, lowest median household income, and highest average
poverty rate.


2.1.4 Time Series Analysis 


We plot the financial metrics against time to analyze potential trends from 2010 to
2022.


[Figure: Average Unemployment Rate Over Time by Region. Unemployment rate (%) by year,
2010 to 2022, with one line each for the East, Midwest, South, and West.]


[Figure: Average Poverty Rate Over Time by Region. Poverty rate (%) by year, 2010 to
2022, with one line each for the East, Midwest, South, and West.]


[Figure: Average Median Household Income Over Time by Region. Median household income ($)
by year, 2010 to 2022, with one line each for the East, Midwest, South, and West.]

All regions showed a generally decreasing unemployment rate from 2010 to 2022. Moreover,
the regional trends moved in parallel, suggesting that while the South may consistently
have higher unemployment rates, the market factors responsible for unemployment are not
confined to any one region.




Poverty rate trends followed a similar pattern to unemployment in every region, with
rates peaking around the early to mid 2010s and decreasing since. Again, the consistency
of these trends across regions suggests that regional factors (such as differing
politics, infrastructure, or culture) are likely not responsible for differing financial
health trends.


All regions showed an upward trend in median household income over the 12-year period.
Despite unemployment and poverty peaking in the early to mid 2010s, median household
income shows no sign of financial distress in that time frame.


2.2 Nutrition, Physical Activity, and Obesity 
2.2.1 Overview 


The Nutrition, Physical Activity, and Obesity Data dataset provides comprehensive health- 
related information across various U.S. states and territories. It includes data on adult and 
adolescent obesity rates, physical activity levels, and dietary habits. The dataset aims to 
track public health trends related to nutrition and physical activity, offering insights into 
the health status of different populations across the country. 


2.2.2 Data Processing 
We process this dataset for our purposes in the following key ways: 


* Filter the dataset to focus on six questions related to obesity. Namely, the six
questions we choose are
1. Percent of adults aged 18 years and older who have obesity
2. Percent of students in grades 9-12 who have obesity
3. Percent of adults who engage in no leisure-time physical activity
4. Percent of students in grades 9-12 who achieve 1 hour or more of moderate-
and/or vigorous-intensity physical activity daily
5. Percent of adults who report consuming vegetables less than one time daily
6. Percent of students in grades 9-12 who consume vegetables less than 1 time
daily


Questions 1, 3, 5 address the obesity, physical activity, and diet of adults, whereas 
questions 2, 4, 6 address the same for children. As a result, these questions give us 
a more complete diagnosis of the physical health of Americans. 


* Group states by US region. Grouping states in the same region can allow us to 
analyze broader trends true across the nation without needing to compare every 
pair of 50 states. We add a region column to the dataset. 


* Calculate yearly average percent for each question by year and region. Taking the 
average over all instances of the question for a given year and region incorporates 
the answer to such questions across various demographics by race and gender. 


• Remove entries with missing data and convert 'YearStart' to datetime format for 
time based analysis. 
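
The filtering and averaging steps above can be sketched as follows. This is a simplified illustration rather than the report's code: apart from `YearStart`, the column names (`LocationDesc`, `Question`, `Data_Value`) are assumed, the region mapping is truncated, and the year is kept as an integer instead of a datetime for brevity.

```python
from collections import defaultdict
from statistics import mean

# Assumed column names; only 'YearStart' is named in the text above.
OBESITY_QUESTION = "Percent of adults aged 18 years and older who have obesity"
REGION_BY_STATE = {"Texas": "South", "Florida": "South", "Ohio": "Midwest"}  # truncated


def yearly_regional_average(records, questions, region_by_state):
    """Average Data_Value over every stratum for each (question, year, region)."""
    grouped = defaultdict(list)
    for rec in records:
        value = rec.get("Data_Value")
        region = region_by_state.get(rec["LocationDesc"])
        if rec["Question"] not in questions or value is None or region is None:
            continue  # skip other questions, missing values, and unmapped locations
        key = (rec["Question"], int(rec["YearStart"]), region)
        grouped[key].append(float(value))
    return {key: mean(values) for key, values in grouped.items()}


# Invented values for illustration only; these are not the report's data.
records = [
    {"YearStart": "2015", "LocationDesc": "Texas", "Question": OBESITY_QUESTION, "Data_Value": 30.0},
    {"YearStart": "2015", "LocationDesc": "Texas", "Question": OBESITY_QUESTION, "Data_Value": 34.0},
    {"YearStart": "2015", "LocationDesc": "Florida", "Question": OBESITY_QUESTION, "Data_Value": 32.0},
    {"YearStart": "2015", "LocationDesc": "Ohio", "Question": OBESITY_QUESTION, "Data_Value": None},
]
averages = yearly_regional_average(records, {OBESITY_QUESTION}, REGION_BY_STATE)
```

Because every demographic stratum contributes one row, this is an unweighted mean over strata, which matches the averaging described in the list.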


2.2.3 Time Series Analysis


For each pair of questions related to obesity, physical activity, and nutrition, we plot
the results for adults and children, respectively.


[Figure: Percent of adults aged 18 years and older who have obesity, by year, with one
line each for the Midwest, South, West, and East.]


[Figure: Percent of students in grades 9-12 who have obesity, by year, with one line each
for the Midwest, South, West, and East.]


It is clear there are steady upward trends for both student and adult obesity. These
trends suggest a broader increase in indulgent lifestyles leading to obesity.


[Figure: Percent of adults who engage in no leisure-time physical activity, by year and
region.]


[Figure: Percent of students in grades 9-12 who achieve 1 hour or more of moderate-
and/or vigorous-intensity physical activity daily, by year, with one line each for the
Midwest, South, West, and East.]


While adult physical activity appears to oscillate around a stagnant mean, student physical 
activity appears to have peaked in the early 2010s and been on a steady decline for every 
region since. This data suggests that both students and adults have become increasingly 
sedentary in recent years and reflects a broader inactive lifestyle. 


[Figure: Percent of adults who report consuming vegetables less than one time daily, by
year and region.]


[Figure: Percent of students in grades 9-12 who consume vegetables less than 1 time
daily, by year, with one line each for the Midwest, South, West, and East.]


It is clear from the nutrition data that for both adults and students, there is a steady
upward trend in the percent of Americans who are not meeting their basic daily
nutritional needs. One would expect there to be very strong positive or negative
correlations between the various data. Indeed, we can analyze the relationship between
these different features in the following correlation matrix.


[Figure: Correlation Matrix of Average Data Values Across Questions, with a color scale
from -1.00 to 1.00.]


Q1: Percent of adults aged 18 years and older who have obesity
Q2: Percent of students in grades 9-12 who have obesity
Q3: Percent of adults who engage in no leisure-time physical activity
Q4: Percent of students in grades 9-12 who achieve 1 hour or more of moderate- and/or
vigorous-intensity physical activity daily
Q5: Percent of adults who report consuming vegetables less than one time daily
Q6: Percent of students in grades 9-12 who consume vegetables less than 1 time daily


Indeed, we see (Q1, Q2) and (Q5, Q6) possess very strong correlations and (Q3, Q4) 
possess very strong negative correlations, verifying these increasingly unhealthy lifestyle 
trends are occurring for both students and adults—note that Q4 is the percentage of stu- 
dents who do exercise and Q3 is the percent of adults who don't exercise, so one would ex- 
pect a negative correlation if both groups are following an increasingly sedentary lifestyle 
trend. 


Strikingly, among adults there is a strong correlation between Q1 (obesity rate) and Q5
(lack of vegetable consumption), but almost no correlation between Q1 (obesity rate) and
Q3 (lack of physical activity). These relationships suggest the American obesity
epidemic can likely be better analyzed through trends in food consumption than through
physical activity. Hence, much of our analysis will focus on indulgent consumption.
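
A correlation matrix of this kind can be computed with the Pearson coefficient, sketched below in plain Python. The yearly values are invented to show the mechanics and are not the report's data; we assume each question has already been reduced to one average value per year.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(series):
    """Return a nested dict with the pairwise correlation of every pair of series."""
    names = list(series)
    return {a: {b: pearson(series[a], series[b]) for b in names} for a in names}


# Invented yearly averages for illustration only.
yearly_averages = {
    "Q1": [28.0, 29.0, 30.0, 31.0],  # rising adult obesity
    "Q4": [27.0, 26.0, 25.0, 24.0],  # falling student activity
    "Q5": [20.0, 21.0, 22.0, 23.0],  # rising lack of vegetables
}
matrix = correlation_matrix(yearly_averages)
```

In this toy input Q1 and Q5 rise together and Q4 falls, so the Q1/Q5 entry is close to 1 and the Q1/Q4 entry is close to -1.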




2.3 Meat Production 
2.3.1 Overview 


We analyze four datasets related to meat production in the United States. The four 
datasets detail amount of meat in storage (millions of pounds), being produced (millions 
of pounds), slaughtered (thousands of heads), and weight slaughtered (average weight in 
pounds) for various types of meat including beef, pork, chicken, lamb, and more. 


Higher meat consumption is conventionally associated with increased standard of living 
and indulgent behavior. By analyzing statistical trends in US meat production, we can 
better assess the correlation of these behaviors with other trends such as economic or 
physical well-being. 


2.3.2 Data Processing 
We clean the datasets by: 


* Filtering data from 2004 onwards. This ensures consistency across the datasets, as
some datasets (namely the meat production dataset) are missing most or all of their
data pre-2000. Furthermore, analyzing data from the last 20 years provides a more
accurate picture of current trends and lies in the same time frame as the other
non-meat datasets.


* Aggregating Commercial and Federally inspected meat. In terms of analyzing con- 
sumer preferences and financial/lifestyle costs, whether the meat is commercial or 
federally inspected is unlikely to play a significant role. Hence, we sum the unit 
values over both types. 


* Handling missing values. To handle missing values, we forward fill and replace re- 
maining NaNs with zeros. This way, we preserve better continuity in the time series 
data. 


* Convert string representations of numerical values to floats. Unit values are repre- 
sented in string format as opposed to numerical values, so we make the conversion 
to be able to run data analysis techniques. 


* Converting date strings to datetime objects. This allows us to analyze the time series 
data. 
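

The cleaning steps listed above can be sketched as follows for a single animal type. This is an illustration under assumptions, not the report's code: the `YYYY-MM-DD` date format and the (date, value) input layout are hypothetical, and the two inspection types are summed date by date as the list describes.

```python
from datetime import datetime


def to_float(raw):
    """Parse strings such as '1,234.5'; return None for blanks."""
    text = (raw or "").replace(",", "").strip()
    return float(text) if text else None


def forward_fill_then_zero(values):
    """Carry the last observed value forward; leading gaps become 0.0."""
    filled, last = [], None
    for value in values:
        if value is not None:
            last = value
        filled.append(last if last is not None else 0.0)
    return filled


def clean_series(rows, start_year=2004):
    """rows: (date_string, value_string) pairs; returns {datetime: float}."""
    parsed = [(datetime.strptime(d, "%Y-%m-%d"), to_float(v)) for d, v in rows]
    parsed = [pair for pair in parsed if pair[0].year >= start_year]
    parsed.sort(key=lambda pair: pair[0])
    dates = [date for date, _ in parsed]
    values = forward_fill_then_zero([value for _, value in parsed])
    return dict(zip(dates, values))


def aggregate(commercial, federally_inspected):
    """Sum the two inspection types date by date."""
    dates = sorted(set(commercial) | set(federally_inspected))
    return {d: commercial.get(d, 0.0) + federally_inspected.get(d, 0.0) for d in dates}


# Invented values for illustration only.
commercial = clean_series(
    [("2003-12-01", "1,900.0"), ("2004-01-01", ""), ("2004-02-01", "2,000.5"), ("2004-03-01", "")]
)
federal = clean_series([("2004-02-01", "100"), ("2004-03-01", "150")])
total = aggregate(commercial, federal)
```

Note that the year filter runs before the fill, so the blank January 2004 value becomes zero rather than inheriting the December 2003 figure.
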
2.3.3 Time Series Analysis 


We plot the storage, production, count, and weight of meat products from 2004 onwards
on a log-scale plot. In addition, we plot the aggregate count over all meat.


[Figure: Storage over Time by Animal and Total. Weight against date on a log scale.
Series: Beef, Broiler, Frozen Eggs, Lamb and Mutton, Other Chicken, Pork, Turkey, Veal,
Total.]


[Figure: Counts over Time by Animal and Total. Count against date on a log scale.
Series: Barrows and Gilts, Beef Cows, Boars and Stags, Broilers, Bulls and Stags,
Calves, Cattle, Dairy Cows, Heifers, Hogs, Lambs and Yearlings, Mature Sheep, Other
Chickens, Sheep and Lambs, Sows, Steers, Turkeys, Total.]


[Figure: Production over Time by Animal and Total. Production against date on a log
scale. Series: Beef, Broilers, Lamb and Mutton, Other Chicken, Pork, Turkey, Veal,
Total.]


[Figure: Weights over Time by Animal and Total. Weight against date on a log scale.
Series: Broilers, Bulls and Stags, Calves, Cattle, Cows, Heifers, Hogs, Other Chickens,
Sheep and Lambs, Steers, Turkeys, Total.]


Analyzing these plots, we see for the most part meat production metrics show a generally 
constant and stable trend over time, with clear seasonality in the data (particularly with 
meats such as Turkeys whose storage peaks may cycle around Thanksgiving). In addition, 
the production plot shows a slow aggregate growth in total production in pounds. This indi- 
cates that, despite short term fluctuations, all metrics point to a stable and slowly growing 
meat industry over the past two decades. 


This slow and steady growth could suggest an increasing demand for meat products, which
could have implications for dietary preferences and socioeconomic trends. Generally,
meat is considered a normal good, and one might expect patterns in meat consumption
and production to follow increases in health issues or consumer purchasing power.


2.4 Sugar and Coffee Commodities 


2.4.1 Overview 


We analyze a dataset of monthly commodity prices (cents per pound) for both sugar and
coffee from 1990 to 2024. Coffee and sugar prices can provide insight into the demand
for or supply of these commodities, possibly revealing trends in consumer preferences
or economic health.


2.4.2 Data Processing 


Little processing was needed to clean the data for analysis purposes. Date columns were 
converted to datetime objects to plot as time series data, and missing value rows were 
dropped (which were simply all data before 1990). 


2.4.3 Time Series Analysis 


We plot the monthly commodity prices for sugar and coffee on a log scale.


[Figure: Coffee and Sugar Commodity Prices Over Time. Monthly prices in cents per pound
on a log scale, with one line each for coffee and sugar.]


Note that we see similar trends in both the increases and decreases of commodity prices
for both goods. For instance, prices seem to drop near the early 2000s and steadily grow
until 2024. This loose positive correlation is analyzed further in Figure 4.3.1.


2.5 Stocks and ETFs 
2.5.1 Overview 


The stocks and ETF dataset contains financial market data for various ticker symbols, 
including both individual stocks and exchange-traded funds (ETFs). The individual stocks 
are all involved with food and produce in some capacity. The dataset provides historical 
price and volume information for each security, allowing for analysis of market trends, 
performance, and volatility across different industries and time periods. 


2.5.2 Data Processing 
To preprocess and clean our dataset, we perform the following: 


* Group stocks together by industry. We seek to reduce stock-to-stock volatility by
analyzing each industry's performance as a whole.


* Calculate the yearly return rate for each stock. To better compare performance between
stocks of various market caps, we analyze % return as a standardized metric.


* Compute weighted average returns by industry. To assess the performance of each
industry, we use a weighted average of the returns of each stock in the industry,
weighting each return in proportion to the market cap of the stock.


* Convert Date-Time into datetime objects for time series plotting.
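
The return and weighting steps above can be sketched as follows. The text does not spell out how the yearly return is defined, so this example assumes it is the change from the first to the last closing price of the year; the tickers and numbers are invented.

```python
def yearly_return(first_close, last_close):
    """Fractional return over a year, assuming first-to-last close (an assumption)."""
    return (last_close - first_close) / first_close


def weighted_industry_return(stocks):
    """Market-cap-weighted average of per-stock returns within one industry."""
    total_cap = sum(stock["market_cap"] for stock in stocks)
    return sum(stock["return"] * stock["market_cap"] for stock in stocks) / total_cap


# Hypothetical tickers and values for illustration only.
industry = [
    {"ticker": "AAA", "return": yearly_return(100.0, 110.0), "market_cap": 300.0},
    {"ticker": "BBB", "return": yearly_return(50.0, 45.0), "market_cap": 100.0},
]
industry_return = weighted_industry_return(industry)
```

Here the larger stock gains 10% and the smaller one loses 10%, so the weighted industry return is about 5% rather than the 0% an unweighted mean would give.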


2.5.3 Summary Statistics 


We plot the mean and standard deviation of each stock's closing price over the 24-year
span. This gives a sense of the volatility and change of stock prices over the time
period, since stocks with higher standard deviations experienced bigger changes in price.


[Figure: Mean Close Price with Standard Deviation by Ticker Symbol and Industry. Bar
chart of mean close price per ticker symbol, colored by industry: RETAIL-EATING PLACES;
MULTIPLE; FARM MACHINERY & EQUIPMENT; POULTRY SLAUGHTERING AND PROCESSING; CONSTRUCTION
MACHINERY & EQUIP; RETAIL-EATING & DRINKING PLACES; BEVERAGES; FATS & OILS; SUGAR &
CONFECTIONERY PRODUCTS; SERVICES-PREPACKAGED SOFTWARE; MEAT PACKING PLANTS; FABRICATED
STRUCTURAL METAL PRODUCTS; FOOD AND KINDRED PRODUCTS; WHOLESALE-GROCERIES & RELATED
PRODUCTS; GRAIN MILL PRODUCTS; RETAIL-BUILDING MATERIALS, HARDWARE, GARDEN SUPPLY;
BOTTLED & CANNED SOFT DRINKS & CARBONATED WATERS.]


2.5.4 Time Series Analysis 


We compute the weighted yearly returns for each industry and plot them as time series
over the full time span. We mark years where a majority of industries yielded positive
returns with green vertical lines and years where a majority yielded negative returns
with red vertical lines.
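
The green and red year markers described above could be derived as in the sketch below. It assumes a strict majority rule (more than half of the industries with data in that year), which the text does not state explicitly, and uses invented returns.

```python
def classify_years(returns_by_industry):
    """Map each year to 'green', 'red', or None from the sign of most industry returns."""
    years = sorted({year for series in returns_by_industry.values() for year in series})
    marks = {}
    for year in years:
        values = [series[year] for series in returns_by_industry.values() if year in series]
        positive = sum(1 for value in values if value > 0)
        negative = sum(1 for value in values if value < 0)
        if positive * 2 > len(values):
            marks[year] = "green"
        elif negative * 2 > len(values):
            marks[year] = "red"
        else:
            marks[year] = None  # no strict majority either way
    return marks


# Invented returns for illustration only; these are not the report's data.
returns_by_industry = {
    "BEVERAGES": {2007: 0.08, 2008: -0.20, 2009: 0.15},
    "MEAT PACKING PLANTS": {2007: 0.03, 2008: -0.35, 2009: -0.02},
    "GRAIN MILL PRODUCTS": {2007: -0.01, 2008: -0.10, 2009: 0.12},
}
marks = classify_years(returns_by_industry)
```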


[Figure: Aggregate Yearly % Return by Industry Over Time, 2000 to 2024. One line per
industry: BEVERAGES; BOTTLED & CANNED SOFT DRINKS & CARBONATED WATERS; CONSTRUCTION
MACHINERY & EQUIP; FABRICATED STRUCTURAL METAL PRODUCTS; FARM MACHINERY & EQUIPMENT;
FATS & OILS; FOOD AND KINDRED PRODUCTS; GRAIN MILL PRODUCTS; MEAT PACKING PLANTS;
MULTIPLE; POULTRY SLAUGHTERING AND PROCESSING; RETAIL-BUILDING MATERIALS, HARDWARE,
GARDEN SUPPLY; RETAIL-EATING & DRINKING PLACES; RETAIL-EATING PLACES;
SERVICES-PREPACKAGED SOFTWARE; SUGAR & CONFECTIONERY PRODUCTS; WHOLESALE-GROCERIES &
RELATED PRODUCTS.]


As one can see, with the exception of 2008-2009, a time of financial recession, a
majority of food industries yielded positive returns every year. This suggests a growing
food industry and can reveal trends in consumer preferences and lifestyle changes.


2.6 Cross-Dataset Correlations 


We analyze the correlations between different features across our datasets. Namely, we 
correlate indicators of indulgent lifestyle, such as meat production and obesity rate, with 
economic indicators such as poverty rate and unemployment rate. 


[Figure: Correlation Matrix of Obesity, Poverty, and Meat Production (After 2011).
Variables: Meat Production, Obesity Rate, Poverty Rate, Unemployment Rate. Color scale
from -1.00 to 1.00.]


As we see from the correlation matrix, there are strong positive correlations between
unemployment rate and poverty rate, and between meat production and obesity rate. This
may be expected, as increases in either unemployment or poverty may be associated with
financial distress, while increases in meat production or obesity may indicate more
indulgent lifestyles.


On the other hand, meat production and obesity rate are both strongly negatively
correlated with poverty rate and unemployment rate. In other words, when the nation is
in financial distress, the indulgent and unhealthy lifestyles associated with obesity
and meat production decline. This suggests that as the nation grows economically, this
indulgent and unhealthy lifestyle can be expected to grow too.


3 Health


The indulgent food culture in America is also reflected in its significant impacts on
public health economics. This analysis uses external datasets [2] to explore the
healthcare costs attributed to food-related diseases, examining the relationship between
increased food production and medical expenditures. By identifying correlations between
food industry practices and health-related costs, we aim to quantify the health-related
economic burden on the public.


3.1 Diabetes 


High in calories, sodium, and fats, junk food can raise levels of triglycerides, a fat present 
in the blood, increasing the risk of developing type 2 diabetes. In addition, highly pro- 
cessed foods can lead to a significant increase in blood sugar over a short amount of time, 
similar to blood sugar levels of those diagnosed with type 2 diabetes. Thus, the American 
unhealthy lifestyle is closely linked to diabetes. 


3.1.1 Time Series Analysis 


Plotting the percentage of the US population diagnosed with diabetes against time from 
1980-2017, we see an overall increase in diabetes diagnoses in the United States. 


[Figure: Time Series Analysis of Diabetes Diagnoses. Percentage (%) against year, with
lines for Percentage and Age-Adjusted Percentage.]


In 1980, only 2.5% of the US population was diagnosed with diabetes, or 2.8% after
adjusting for age. Over the course of 37 years, the diabetes rate increased by 188%, to
a total of 7.2%, or 6.3% when adjusted for age. The rate of diabetes may be greater than
recorded as, according to the CDC, many cases go undiagnosed. With the increase in
indulgent food culture and easier access to junk foods, it is expected that the rate of
diabetes will continue to grow.
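
The 188% figure follows from the unadjusted rates quoted above, as this short check shows. Applying the same formula to the age-adjusted rates is our own addition and gives a smaller relative increase.

```python
def percent_increase(start, end):
    """Relative change from start to end, expressed in percent."""
    return (end - start) / start * 100


crude = percent_increase(2.5, 7.2)         # unadjusted rate, 1980 to 2017
age_adjusted = percent_increase(2.8, 6.3)  # age-adjusted rate, same period

print(f"Unadjusted increase: {crude:.0f}%")
print(f"Age-adjusted increase: {age_adjusted:.0f}%")
```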


3.1.2 Diabetes and Sugar Prices


[Figure: Comparison of Sugar Prices, Diabetes Percentage, and Age-Adjusted Diabetes
Percentage, by year, 1990 to 2015.]


Comparing sugar prices with the percentage of the American population with diabetes, we
find a correlation coefficient of 0.63, indicating a positive correlation between sugar
prices and diabetes.


3.1.3 Diabetes and Meat Production 


[Figure: Correlation Matrix of Meat Production and Diabetes Percentage.]


Consumption of more than one serving of meat a day can lead to higher risks of Type 2
diabetes. Examining the correlation coefficients between diabetes and meat production,
we see an overall positive correlation between diabetes and most types of meat. 'Lamb
and Mutton' and 'Veal' specifically show a negative correlation, which we attribute to
the near-negligible consumption of these two meats compared to more popular meats such
as beef and pork.


3.1.4 Cost of Diabetes 


Taking data from an external source, we graph the average annual spending of individuals 
with health coverage from a large company against time (2003-2017) to see trends in all 
enrollees, those with diabetes with complications, and those with diabetes without com- 
plications. 


[Figure: Average Annual Spending of Individuals with Health Coverage, 2003-2017. Series:
All enrollees, Diabetes with complications, Diabetes without complications, each with a
fitted trend line. X-axis: Year.]


It is evident from the graph that overall medical costs increased steadily from 2003 to
2017. This steady increase can be attributed to a variety of often intertwined factors
spanning economic, demographic, technological, and policy-related domains. One possible
reason is advancement in medical technology: new drugs, medical devices, and cutting-edge
treatments typically require substantial research and development investment, which can
drive up healthcare prices.


Using ordinary least squares (OLS) regression, we find that the average annual medical cost
increases at a rate of $202.9821 a year for all enrollees, $359.4929 a year for those with
diabetes without complications, and $694.2036 a year for those with diabetes with
complications. The fitted lines had R-squared values of 0.994, 0.972, and 0.994
respectively, indicating a close fit. Thus, although medical costs increase steadily for
everyone, the medical cost of those diagnosed with diabetes with complications is growing
at a rate more than triple that of all enrollees. Even individuals diagnosed with diabetes
without any complications see an increase in annual average medical costs almost double
that of all enrollees.
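The trend-line fit described above can be sketched as follows. This is a minimal example
under stated assumptions: the spending series is synthetic (a known slope of 200 dollars
per year plus noise), the helper name `ols_trend` is ours, and the goodness-of-fit value is
computed as the usual coefficient of determination.

```python
import numpy as np


def ols_trend(years, costs):
    """Fit costs = intercept + slope * year by ordinary least squares.

    Returns (slope, intercept, r_squared).
    """
    x = np.asarray(years, dtype=float)
    y = np.asarray(costs, dtype=float)
    slope, intercept = np.polyfit(x, y, deg=1)
    fitted = intercept + slope * x
    ss_res = np.sum((y - fitted) ** 2)      # unexplained variation
    ss_tot = np.sum((y - y.mean()) ** 2)    # total variation
    return float(slope), float(intercept), float(1.0 - ss_res / ss_tot)


# Synthetic spending series (dollars per enrollee), NOT the report's data.
years = np.arange(2003, 2018)
rng = np.random.default_rng(0)
spending = 3000.0 + 200.0 * (years - 2003) + rng.normal(0.0, 40.0, years.size)

slope, intercept, r2 = ols_trend(years, spending)
print(f"slope = {slope:.1f} dollars/year, R^2 = {r2:.3f}")
```

The ratio of two fitted slopes (for example 694.2036 / 202.9821, about 3.4) is what
supports a statement such as "more than triple".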




4 Economics 


The American indulgent food culture is closely linked to many economic indicators, such as
major stock prices and unemployment rates. In this section, we examine correlations, both
significant and insignificant, as well as potential causal relationships.


4.1 Stocks/ETFs and Indulgent Food Econometrics 


We used the Meat Production, Sugar and Coffee Commodities, and Stocks and ETFs datasets
to analyze the predictive power of food production and commodity prices for relevant
stock prices.


4.1.1 Signal Preprocessing 


To account for short-term noise and the major COVID-19 pandemic, we used the Fast Fourier
Transform (FFT) to smooth the stock price, meat production, and coffee/sugar price time
series. We conducted correlation analysis on the smoothed data to capture the general
trend without short-term fluctuations. For causal inference, we used the raw data, because
production and commodity prices moved together with stock prices during the pandemic, all
experiencing a sharp decline.
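One common way to smooth a series with the FFT is a low-pass filter: transform, zero out
the high-frequency components, and transform back. The sketch below illustrates that idea
on a synthetic series. The cutoff parameter `keep` and the function name `fft_smooth` are
assumptions made for illustration; the report does not state which cutoff was used.

```python
import numpy as np


def fft_smooth(series, keep: int):
    """Low-pass filter: keep the `keep` lowest-frequency FFT components.

    `keep` counts rfft bins, including the constant (mean) term at index 0.
    """
    x = np.asarray(series, dtype=float)
    spectrum = np.fft.rfft(x)
    spectrum[keep:] = 0.0  # discard high-frequency components
    return np.fft.irfft(spectrum, n=x.size)


# Synthetic monthly "price": a slow cycle plus short-term noise (illustrative only).
rng = np.random.default_rng(1)
t = np.arange(240)
slow = 50.0 + 10.0 * np.sin(2.0 * np.pi * t / 120.0)
noisy = slow + rng.normal(0.0, 2.0, t.size)

smoothed = fft_smooth(noisy, keep=6)
rms_before = float(np.sqrt(np.mean((noisy - slow) ** 2)))
rms_after = float(np.sqrt(np.mean((smoothed - slow) ** 2)))
print(f"RMS error before: {rms_before:.2f}, after: {rms_after:.2f}")
```

Because the FFT treats the series as periodic, a strongly trending series can show
distortions near its two ends after this kind of filtering.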


The following graph shows several major food stocks after smoothing. 


[Figure: Closing Prices of Selected Major Stocks Over Time, 2000-2024. Series: SBUX, HSY,
PEP, QSR, MCD, and YUM, each raw and smoothed. Y-axis: Closing Price.]


4.1.2 Meat Production 


Using regular expressions, we identified the following stocks as closely associated with
meat-based cuisine: QSR, CAG, HRL, DPZ, CMG, DRI, MCD, PPC, YUM, and WEN.




After FFT smoothing, we computed the following correlations with meat production, starting
from each stock's first documented price and using monthly data where available:


Stock       | QSR  | CAG  | HRL  | DPZ  | CMG  | DRI  | MCD  | PPC  | YUM  | WEN
Correlation | 0.72 | 0.54 | 0.72 | 0.73 | 0.66 | 0.86 | 0.84 | 0.28 | 0.80 | 0.12


We observe a moderate to strong positive linear correlation between stock prices and meat
production, with the exceptions of Pilgrim's Pride Corp (PPC) and Wendy's (WEN). In
general, as food production grows, food stock prices also increase.


As reported in the EDA, the food production yields increased steadily after 2004. 


[Figure: Meat Production Over Time (Million Pounds), by type of meat: Beef, Broilers, Lamb
and Mutton, Other Chicken, Pork, Turkey, Veal. X-axis: Year; y-axis: Production (Million
Pounds).]


Both stock prices and meat production have grown over the last two decades, which may be
due to confounding variables such as population growth and inflation. We used the Johansen
cointegration test to determine whether stock prices and meat production are cointegrated,
that is, whether they share a stationary long-run relationship. For both red meat and
poultry, and for all selected stocks, the two time series are cointegrated at the 95%
confidence level.


Hence, we performed the Granger Causality Test for red meat and poultry separately to
further determine the predictive power of meat production for the relevant food stock
prices. In particular, we made both time series stationary by taking logarithms and
differencing (i.e., taking a discrete estimate of the derivative), so that their means and
standard deviations are approximately constant over time.
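The two steps described above, log-differencing and a Granger causality test, can be
sketched with NumPy alone. This is an illustrative implementation of the SSR-based F
statistic only, run on synthetic data in which "production" drives "price" with a
one-step delay. All function and variable names are our own, and the lag order of 2 is an
arbitrary choice for the example. Converting the statistic into a p-value requires the F
distribution with (lags, dof) degrees of freedom, which is omitted here; packages such as
statsmodels provide a ready-made `grangercausalitytests` function.

```python
import numpy as np


def log_diff(series):
    """First difference of the natural logarithm (approximate growth rate)."""
    return np.diff(np.log(np.asarray(series, dtype=float)))


def lagged(v, lags: int):
    """Columns v[t-1], ..., v[t-lags] for t = lags, ..., len(v) - 1."""
    return np.column_stack([v[lags - k: len(v) - k] for k in range(1, lags + 1)])


def granger_f_stat(y, x, lags: int) -> float:
    """SSR-based F statistic for the null 'x does not Granger-cause y'."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    target = y[lags:]
    const = np.ones((target.size, 1))
    restricted = np.hstack([const, lagged(y, lags)])         # own lags only
    unrestricted = np.hstack([restricted, lagged(x, lags)])  # plus lags of x

    def ssr(design):
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        return float(resid @ resid)

    ssr_r, ssr_u = ssr(restricted), ssr(unrestricted)
    dof = target.size - unrestricted.shape[1]
    return ((ssr_r - ssr_u) / lags) / (ssr_u / dof)


# Synthetic growth rates: price growth follows production growth one step later.
rng = np.random.default_rng(2)
n = 300
production_growth = rng.normal(0.0, 0.02, n)
price_growth = np.zeros(n)
for i in range(1, n):
    price_growth[i] = 0.6 * production_growth[i - 1] + rng.normal(0.0, 0.01)

# Rebuild positive level series so that log_diff has something to act on.
production = 100.0 * np.exp(np.cumsum(production_growth))
price = 50.0 * np.exp(np.cumsum(price_growth))

f_forward = granger_f_stat(log_diff(price), log_diff(production), lags=2)
f_reverse = granger_f_stat(log_diff(production), log_diff(price), lags=2)
print(f"production -> price: F = {f_forward:.1f}; price -> production: F = {f_reverse:.1f}")
```

A large F statistic in one direction and a small one in the other is the pattern expected
when one series helps predict the other but not the reverse. Granger causality is a
statement about predictive power, not about causation in the everyday sense.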




[Figure: Processed Closing Prices of YUM Over Time (series: YUM Prices, YUM Stationarized;
y-axis: Price) and Processed Meat Production Over Time (series: Meat Production, Production
Stationarized; y-axis: Production (Million Pounds)). X-axis: Date, 2000-2024.]


We found statistically significant Granger-causal relationships at a significance level of
0.05 between meat production and the prices of various stocks, at lags of varying numbers
of months. The following tables display "Y" if all tests of significance passed at a lag
of i months, "N" if no tests passed, and "P" if only some tests passed. We ran four tests
of inference: the Sum of Squared Residuals F-test, the Sum of Squared Residuals Chi-squared
test, the Likelihood Ratio Test, and the Parameter F-test.


Statistical Significance for Red Meat Production on Stock Prices 


Lag (Months) | CAG | HRL | YUM
1            | Y   | Y   | N
2            | N   | Y   | N
3            | N   | Y   | N
4            | N   | Y   | Y
5            | N   | Y   | Y


Statistical Significance for Poultry Production on Stock Prices 


Lag (Months) | CAG | HRL | YUM | WEN
1            | Y   | Y   | N   | N
2            | N   | Y   | N   | N
3            | N   | N   | P   | Y
4            | N   | N   | Y   | Y
5            | N   | N   | Y   | Y


The process of meat production, delivery, storage, and cooking is a pipeline that leads
to varying delays for different indulgent food companies.


CAG and HRL appear to respond almost immediately in price to both red meat and poultry
supplies. This can be inferred from their significant results at a lag of 1 or 2 months
after meat production; in most cases the effect fades beyond 2 months after a fluctuation
in meat production, although HRL remains significant for red meat at all tested lags. For
both red meat and poultry production, YUM shows a delay of around 4 months. WEN's stock
price, on the other hand, appears to be more associated with poultry production, at a lag
of around 3 months.




4.1.3 Coffee and Sugar Prices


Using regular expressions, we selected the following relevant stocks: SBUX, HSY, ADM, KDP,
PEP, MNST, and COKE.


We applied similar preprocessing to smooth the signals before computing correlations
between coffee/sugar prices and relevant stock prices over time. The computed correlations
are positive and range from weak to moderate.


Correlation between Selected Stocks and Coffee Prices
Stock       | SBUX | HSY  | ADM  | PEP  | COKE
Correlation | 0.33 | 0.22 | 0.70 | 0.54 | 0.38


Correlation between Selected Stocks and Sugar Prices
Stock       | SBUX | HSY  | ADM  | PEP  | COKE
Correlation | 0.23 | 0.11 | 0.59 | 0.42 | 0.25


In particular, sugar prices have a weaker correlation with the selected stock prices than
coffee prices do. ADM's stock price has the strongest correlation with both coffee and
sugar prices out of the five stocks. Conversely, HSY has the weakest correlation with both
commodities. No Johansen tests yielded significant results for cointegration between any
stock signal and coffee or sugar prices.


To examine whether there are any significant causal relationships, we performed the Granger
Causality test after stationarizing both commodity prices and stock prices. The test
yielded statistically significant results only for PEP, at lags of 1 and 2 months, with all
four statistical tests giving P-values below 0.05 at both lags. We infer that PEP responds
to coffee prices within one to two months after a fluctuation in commodity prices.


We trained an LSTM model on PEP stock prices to validate this response to coffee prices
within 1 to 2 months. The LSTM learned key features of commodity prices and their
relationship to stock prices. We then generated synthetic impulses in coffee prices and
simulated the effect of impulses of different magnitudes on PEP stock prices.




[Figure: Impulse response functions (IRF) of PEP stock prices for synthetic coffee price
impulses of magnitude 3, 5, 7, and 10. X-axis: Steps.]


The responses of PEP to our synthetic impulses align with the results of the Granger
Causality test: the PEP stock price stabilizes after one to two months without further
increases or fluctuations.


In general, coffee/sugar commodity prices appear to have low predictive power for the stock
prices of beverage companies other than PEP, in contrast to the influence of meat
production on the stock prices of meat-related companies.


4.2 Econometrics and Business Cycles 


To understand how unemployment rates and business cycles fluctuate with meat production
and coffee/sugar commodity prices, we used the Cross-Correlation Function (CCF) and an
impulse response function (IRF) convolution impact simulation. We focused mainly on
unemployment rates.


We used yearly data in this section. Therefore, all units of lags are in years. 


4.2.1 Meat Production 


We smoothed both the unemployment rate and the meat production series before computing
the CCF between unemployment rates and meat production by geolocation, with a maximum lag
of 20 years. To compute the CCF, we used the differenced logarithms of the unemployment
rate and meat production to filter out trends and other nuisance variation.
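A sample cross-correlation function can be sketched as below. The sign convention (a
positive lag pairs a later value of the first series with an earlier value of the second)
is an assumption, since the report does not state which convention its plots use. The data
are synthetic and already stationary; for real level series one would first apply the log
differencing described above, for example `np.diff(np.log(series))`.

```python
import numpy as np


def cross_correlation(x, y, max_lag: int):
    """Sample cross-correlation of x[t + lag] with y[t] for lag in [-max_lag, max_lag]."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    n = x.size
    ccf = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            ccf[lag] = float(np.sum(x[lag:] * y[: n - lag]) / n)
        else:
            ccf[lag] = float(np.sum(x[: n + lag] * y[-lag:]) / n)
    return ccf


# Synthetic example: `follower` repeats `driver` three steps later, plus noise.
rng = np.random.default_rng(3)
driver = rng.normal(size=200)
follower = np.roll(driver, 3) + rng.normal(scale=0.3, size=200)

ccf = cross_correlation(follower, driver, max_lag=10)
best_lag = max(ccf, key=lambda k: abs(ccf[k]))
print(f"strongest correlation at lag {best_lag}: {ccf[best_lag]:.2f}")
```

With yearly data, large lags leave only a few overlapping observations, so the estimates
at the far ends of the lag range are considerably noisier than those near zero lag.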




[Figure: CCF between Unemployment and Red Meat Production by Region. Panels: Midwest,
South, West, and East Cross-Correlation Function. X-axis: Lag (-20 to 20).]


The correlations appear to follow a wave-like pattern across lags. Across the regions,
unemployment rate and red meat production consistently show a moderately weak, positive
correlation at zero lag, but a weak, negative correlation at a lag of around -6 years.


The unemployment rate is an indicator of the business cycle. High unemployment rates
indicate a recession, and low unemployment rates indicate an expansion of the economy. Our
analysis suggests that it takes around 6 years, across the four regions of the United
States, for the food production market, along with its downstream secondary and tertiary
markets such as manufacturing and retail, to grow to its free-market capacity. The market
then self-adjusts to the oversupply of workers, causing it to shrink.


Because of other delay factors such as cold storage and transportation of meat, which are
presumed to increase during expansions, meat production also peaked when the unemployment
rate reached a local maximum.


Conversely, similar reasoning applies to the contraction phases of the economy.




[Figure: CCF between Unemployment and Poultry Production by Region. Panels: Midwest,
South, West, and East Cross-Correlation Function. X-axis: Lag (-20 to 20).]


For poultry, the trend across the four regions is less clear. This may be attributed to
different dining cultures that place varying emphasis on poultry consumption in different
areas. The Midwest, West, and East statistics also did not align with the pattern of the
red meat correlations: in these regions the correlation reaches its minimum at zero lag.
This may be attributed to a weaker influence of poultry production in these regions.


4.2.2 Coffee Prices 


The raw and smoothed coffee and sugar prices over the provided timeframe exhibit business
cycle trends, particularly coffee prices. Coffee prices show a general upward trend and
clear influences of economic expansion and contraction.


We used the mathematical model
P(t) = \beta_0 \sin(\beta_1 t + \beta_2) + \alpha_0 t + \epsilon
to identify business cycles, where the term \alpha_0 t + \epsilon accounts for inflation
and P(t) gives the coffee price at time t. We fitted the parameters by minimizing the mean
squared error.
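One way to fit a model of this form by least squares is sketched below. It is not
necessarily the optimizer used in the report: for a fixed frequency the sine term can be
rewritten as A sin(b1 t) + B cos(b1 t), which makes the model linear in its remaining
parameters, so the sketch scans a grid of candidate frequencies and solves a linear
least-squares problem at each one. The data, the grid, and the name `fit_cycle` are all
assumptions made for illustration.

```python
import numpy as np


def fit_cycle(t, price, freqs):
    """Least-squares fit of b0*sin(b1*t + b2) + a0*t + c over a grid of b1 values.

    Uses b0*sin(b1*t + b2) = A*sin(b1*t) + B*cos(b1*t) with A = b0*cos(b2)
    and B = b0*sin(b2), so each candidate b1 gives a linear problem.
    Returns (mse, b0, b1, b2, a0, c) for the best candidate.
    """
    t = np.asarray(t, dtype=float)
    price = np.asarray(price, dtype=float)
    best = None
    for b1 in freqs:
        design = np.column_stack(
            [np.sin(b1 * t), np.cos(b1 * t), t, np.ones_like(t)]
        )
        coef, *_ = np.linalg.lstsq(design, price, rcond=None)
        mse = float(np.mean((price - design @ coef) ** 2))
        if best is None or mse < best[0]:
            amp_sin, amp_cos, trend, const = coef
            best = (
                mse,
                float(np.hypot(amp_sin, amp_cos)),     # b0, amplitude
                float(b1),                             # b1, angular frequency
                float(np.arctan2(amp_cos, amp_sin)),   # b2, phase
                float(trend),                          # a0, linear trend
                float(const),                          # c, constant offset
            )
    return best


# Synthetic prices generated from known parameters (illustrative only).
rng = np.random.default_rng(4)
t = np.arange(400.0)
price = 40.0 * np.sin(0.03 * t - 0.9) + 0.1 * t + 70.0 + rng.normal(0.0, 2.0, t.size)

mse, b0, b1, b2, a0, c = fit_cycle(t, price, np.linspace(0.01, 0.06, 51))
print(f"b0={b0:.1f}, b1={b1:.3f}, b2={b2:.2f}, a0={a0:.3f}, c={c:.1f}, mse={mse:.2f}")
```

The grid resolution limits how precisely the frequency is recovered; a finer grid or a
nonlinear optimizer started from the best grid point would refine the estimate.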




[Figure: Coffee Prices in Business Cycles, 1992-2024. Series: Coffee Prices, Smoothed
Coffee Prices, Fitted Model. X-axis: Date; y-axis: Commodity Price.]


The fitted model was
P(t) = 46.28 \sin(0.00115 t - 0.95) + 0.0106 t + 73.07.


Qualitatively, we identified two main recessions from our model: the early 2000s recession
and the COVID-19 recession. The early 2000s recession disrupted the global supply chain,
affecting coffee imports and exports. The COVID-19 recession, on the other hand, severely
impacted coffee commodity prices due to lockdowns and the shift to working from home,
casting a negative shadow over coffee beverage companies.


The Great Recession, though lengthy and well known, did not noticeably affect coffee
prices, as coffee has relatively inelastic demand, which implies steady prices unless
supply or demand chains are disrupted.


4.3 Inflation and Implicit Costs 
4.3.1 Correlation of Commodity Consumption and Consumer Basket 


As demonstrated earlier in the Exploratory Data Analysis, we analyzed the correlations be- 
tween different features across our datasets, such as economic indicators and indulgent 
lifestyle indicators. 


The 0.56 correlation between coffee prices and obesity rate suggests a moderate positive
relationship between the two variables: coffee prices tend to be higher when the obesity
rate is higher. Looking at the correlation between sugar prices and economic indicators,
we notice a 0.44 correlation between sugar prices and unemployment and a 0.15 correlation
between sugar prices and the poverty rate.


In addition, there is a -0.96 correlation between meat production and poverty rate and a
-0.97 correlation between meat production and unemployment rate, which shows a strong
negative correlation between meat production and both poverty and unemployment rates.
These correlations suggest that when the US is growing economically, there are increased
prices of sugar and coffee and decreased prices of meat.


[Figure 1: Correlation Matrix of Various Features (After 2011). Features: Coffee Prices,
Sugar Prices, Meat Production, Obesity Rate, Poverty Rate, Unemployment Rate.]


Thus, we can conclude that there may be a relationship between the economic indicators
and the consumption of commodities and indulgent foods. Specifically, with a growing
economy, there are increased prices of sugar and coffee, which may be due to increased
demand for these goods, and a decreased price of meat, possibly due to decreased demand.
This provides an opposing perspective, as it may be expected that a growing economy
results in more indulgent behavior, which is associated with obesity.


4.4 Geolocation 


As discussed in the EDA, we separated America into four different regions (West, Midwest, 
Northeast, and South) in order to investigate if there was a relationship between the eco- 
nomic state of a region and its obesity rate, as states in similar regions may have similar 
characteristics. 




For each region, we analyze the correlations between obesity rate and the economic
indicators from our datasets. While analyzing these correlation matrices, it is important
to consider that the unemployment rate should not be analyzed with the student obesity rate
as students are most likely unemployed, which may not be reflected in the data.


First, we look at the correlations for the East, which show a -0.22 correlation between
unemployment and adult obesity rate and a -0.19 correlation between median household
income and student obesity rate. Both of these are slight negative correlations. This
suggests that adults might have higher obesity rates when there is less unemployment, and
students may have higher obesity rates when in households with lower income.


[Figure: Correlation Matrix for East. Variables: Poverty Rate, Median Household Income,
Unemployment Rate, Adult Obesity Rate, Student Obesity Rate.]


Next, we analyze the correlations for the Midwest, where we notice a 0.39 correlation
between median household income and student obesity rate and a 0.17 correlation between
adult obesity rate and median household income. These positive correlations contradict the
correlations from the East.




[Figure: Correlation Matrix for Midwest. Variables: Poverty Rate, Median Household Income,
Unemployment Rate, Adult Obesity Rate, Student Obesity Rate.]


Looking at the South, we observe a 0.42 correlation between poverty rate and student
obesity rate and a -0.23 correlation between unemployment rate and adult obesity rate.
This suggests that student obesity rates tend to be higher where poverty is higher, and
that a lower unemployment rate may correlate with a higher adult obesity rate.


[Figure: Correlation Matrix for South. Variables: Poverty Rate, Median Household Income,
Unemployment Rate, Adult Obesity Rate, Student Obesity Rate.]


Finally, we analyze the West's correlation matrix. Most correlations are too small in
absolute value to demonstrate a relationship between economic indicators and obesity
rates, but it can be noted that there is a 0.29 correlation between poverty rate and
student obesity rate.




[Figure: Correlation Matrix for West. Variables: Poverty Rate, Median Household Income,
Unemployment Rate, Adult Obesity Rate, Student Obesity Rate.]


Overall, each geographical region behaves differently, but a common pattern is a positive
correlation between student obesity rate and poverty rate, which may indicate that higher
poverty leads to food insecurity and reliance on unhealthy foods. Additionally, in the
East and South, unemployment rates are weakly negatively correlated with adult obesity
rates (-0.22 and -0.23), so our data do not clearly support the hypothesis that
unemployment increases indulgent food behaviors.




5 Conclusion 


5.1 Limitations and Future Directions 


A limitation of our analysis is the generalization of regions. Since the dataset was
grouped by state, we could not examine in finer detail how the living environment, such as
urban versus rural settings or transportation modes, plays a role in health and
consumption.


Another possible limitation was the dataset for unemployment. Since it provided only
yearly unemployment data, it was not possible to identify seasonal trends or analyze
smaller windows of time. Thus, we recommend that future studies explore the relationship
between seasonal unemployment trends and indulgent food culture.


For future analysis, we would examine how other confounding variables interact with the
health of US citizens. Although we were able to investigate several variables, such as
vegetable consumption, physical activity, and income, additional factors that we were not
able to investigate due to limited resources include residential area, genetic
predisposition, psychological stress, and media influence.


5.2 Recommendation to the United States Government 


Our analysis underscores the significant economic and health ramifications of America's
indulgent food culture, particularly the consumption of high quantities of meat and sugar.
A possible solution is for the US Government to incentivize the production and consumption
of healthier alternatives, for example through subsidies and tax benefits. By adopting
this recommendation, the US Government could help mitigate the economic effects that may
be tied to Americans' health, as discussed in our findings.




References 


[1] Bureau, US Census. Geographic Levels — census.gov. 
programs-surveys/economic-census/guidance-geographies
[Accessed 04-08-2024].


[2] Tracker, Peterson-KFF Health System. How have diabetes costs and outcomes changed 
over time in the U.S.? https://www.healthsystemtracker.org/chart-collection
diabetes-care-u-s-changed-time/. [Accessed 04-08-2024]. 




